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Abstract 

Facing worldwide challenges associated with multifactorial etiology of breast cancer, designing of 
combinatorial therapies using natural compounds is currently the emergent way of treating several cancers 
including breast cancer in a synergistic way, which may mitigate several problems associated with multiple 
receptor targeting. In this research, Estrogen receptor positive breast cancer was taken as prototype and several 
key receptors associated with this particular disease were targeted by virtual screening of natural compounds 
found in Indian originated medicinal plants using Computer aided Drug Designing (CADD) strategies. We 
found the combination of Carpusin, Paulownin Cornigerine, Nororientaline, Oryzalexin B, Romucosine H and 
Colchicine as effective against six potential receptors i.e. FGFR2, ESRI, PIK3CA, PIK3CB, PIK3CD and AR 
in Estrogen receptor positive breast cancer with their binding energies in the range of AG < -8.0 Kcal/mol as 
well as significant number of common amino acid binding residues as compared with binding sites of 
receptors. Thus this research holds significant implications for the designing of combinatorial therapeutic 
agents against breast cancer which can be further tested in-vitro and in-vivo to prove their synergistic 
efficiency. 

Key words: CADD, FGFR2, ESR1, Combinatorial therapies, PIK3, AR 


1. INTRODUCTION 

The occurrence of cancer is related to several factors to which human expose, the complexity of 
which is an arduous challenge in the field of epidemiology due to which many epidemiologists around the 
world have been understanding and determining cancer's multifactorial etiology. [1] 

The complex interplay of several genetic and non-genetic factors have been clearly assessed in 
several researches that predispose to development of cancer. In this research, breast cancer was taken as a 
prototype due to its prevalence at a large scale worldwide and in major areas of India. Breast cancer has been 
described as multifactorial in occurrence in several epidemiological studies in which 90% cases occur due to 
the involvement of non-genetic factors, like environmental factors, major biological, chemical carcinogenic 
agents. [2]—[4] 

To date, many researches have shown that a dysregulation of several molecular and signaling 
pathways occur due to the multifactorial effects pf breast cancer. Nevertheless, several potential therapies are 
needed for targeting multiple receptors and pathways to serve as an effective treatment against its proliferation 
and metastasis. [5] 

Breast Cancer being an important concern of health and cause of deaths due to cancer among women 
worldwide has shown increased trends of prevalence and incidence, resulting in a high mortality rate globally. 
There has been a sharp increase in breast cancer cases in recent past. [6] [7]. According to ‘Global Burden of 
Disease Cancer Collaboration’, 2017, the data showed an increment by 33% overall cancer cases from 2005- 
2015 worldwide. [8] [9]. The total cancer cases are likely to go up to 1,148,757 cases in 2020. [6] 

The age standardized prevalence data of cancer cases shows 97 per one lakh individuals according to 
recent trends. According to a report by WHO, the present data of 6% total deaths due to cancer in India might 
be harmful in near future which is estimated to show an increment up to 9 million deaths by 2030. [9]-[11] 

Improvements in cancer therapies have shown a remarkable success in recent past. Modern therapies 
have been developed which varies from DNA therapies based on DNA-excision repairing and modifying cell- 
apoptotic pathways etc. to other potential treatments like hormone-based therapies like tamoxifen (Estrogen 
synthesis blocking agent) has been developed for breast cancer. Monoclonal antibody based treatment, 
Immune-based therapies (ICIs, CART therapies) etc. are other potential therapies. [12] [13] 

Drug delivery systems like nanoparticles have also been developed in a past decade for targeted 
delivery in several cancers including breast cancer to avoid by-stander killing of tumor cells. [14] [15]. They 
include Doxorubicin, ThermoDox (Temperature based release of doxorubicin), SPIONs (Supermagnetic iron 
oxide nanoparticles), Nanotherm etc. [16] [17] 

In the recent Covid-19 pandemic due to novel coronavirus (SARS COV-2), started out around December 2019, 
there has been a declination in cancer research worldwide and a high chances of increasing comorbidities in 
patients during this era. Cancer patients have been found vulnerable to Covid-19 infection according to some 
Chinese studies. Moreover consequences like delayed treatment, distraction in healthcare for cancer patients 
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during pandemic conditions pose fatal consequences for cancer patients. [18]. Remarkable efforts have been 
made uptil now for overcoming the therapeutic and diagnostic challenges in this vulnerable population. 
According to recent safe guidelines of Canadian Association of Breast Imaging, delaying of breast cancer 
imaging and screening is needed to be avoid along with prevention of infection to the patients in Covid-19 
pandemic. [18] [19] 

Combinatorial therapies for cancer have been designed in many researches for overcoming 
challenges of cross-talk inhibition, signaling feedback etc. The rationale behind using multiple drugs is to 
suppress multiple receptors and pathways in a particular tumor to eradicate it in a synergistic way as compared 
with monotherapies. [20] [21]. Uptil now, different combinations of drugs including natural compound-based 
drugs, synthetic drugs, monoclonal antibodies etc. have been designed. [9] [22]. Examples include, USFDA 
approved Ipilimumab and nivolumab for unresectable melanoma. [23]. Combinatorial therapies have also been 
designed in Breast cancer. Eg. Combinations of Gemcitabine and Paclitaxel with Cyclophosphamide and 
Epirubicin in breast cancer. [24] , combination of Fulvestrant and Anastrozole in hormone receptor positive 
breast cancer. [25]. 

Natural compounds has attained enormous benefits in therapeutics like stability, enhancement in 
efficacy, reduced toxicity, bioavailability, etc. In cancer treatment, many natural compound-based medicines 
have been designed as monotherapeutics and in combinations as well as in combination with synthetic 
compounds. [26]—[28]. Examples- Cyclophosphamide and Paclitaxel (Synthetic) in combination with 
curcumin (Natural), has been used in MDA-MB-231, MCF-12F and MCF-7 cancer cell lines. [29]—[33]. 
Combinations of natural compounds in breast cancer have been documented in several researches. [34] [31]. 

Nanoparticles in conjugation with natural compounds and their combinations have been tailored in 
different cancers including breast cancer, to act as an efficient targeted drug delivery tool to enhance the 
specificity and preventing bystander killing. [14]-[17]. 

Breast Cancer as a prototype 

Breast Cancer has been found as the most frequently diagnosed cancer in India. The incidence rate 
has been estimated to 25.8/100000 individuals. In this project, estrogen receptor positive breast cancer was 
taken as a prototype. Estrogen receptor positive breast cancer accounts for approximately 80% breast cancer 
cases. Uptil now, various therapies have been designed for ER positive BC. [35]. They include, endocrine 
therapies under which GnRHa (Gonadotropin releasing hormone agonists), SERMSs (Selective estrogen 
receptor modulators, eg. Tamoxifen), SERDs (Selective estrogen receptor down-regulators, eg. Fulvestrant) 
in combination with AIs (Aromatase inhibitors) are given as combinatorial drug therapies. [36] [37]. 

Although a vast majority of receptors are associated with the expression of ER(+)BC, but in this 
research, some potential receptors of ER(+)BC in human were screened using a database which have been 
found to play an important role in its progression. Here, FGFR2 gene (encode protein from fibroblast growth 
receptor family), ESRI gene (encode estrogen receptor and a transcription factor), AR gene (encode androgen 
receptor), PIK3CA, PIK3CB and PIK3CD genes (encoding three catalytic isoforms i.e. p110a, p110B and 
p1106 of phosphatidylinositol 4,5-bisphosphate 3-kinase respectively) were taken into consideration. 

FGFRs play an important role in the progression of several cancers whose structures composed of 
different domains like 3 immunoglobulin-like domain, cytoplasmic tyrosine kinase domain, extra-cellular 
domain. [38]. FGFR2 has been reported as a crucial biomarker in breast cancer. In a research, its differently 
expressed genes (namely CXKCL8, BMP7, CD44, MMP9) have been found between FGFR2 silenced breast 
cancer cell lines. FGFR2 also plays an important role in the progression of ER(+)BC. N550K and M5381 
mutations of FGFR2 has been observed to offer resistance to CDK4/6, SERDs inhibitors in ER(+)BC patients. 
[38] [39] 

ESRI gene that encodes for estrogen receptor (ER) is an important biomarker in ER(+)BC which 

localizes to nucleus and forms homodimer or heterodimer with ESR2. [40]. In a research, patients with 
circulating ESR1 mutations have been found resistant to endocrine therapy which is given in ER BC treatment. 
In another research, these ESRI mutations have been validated using ct-DNA (circulating tumor DNA) 
detection method in ER(+)BC. [41] [42] 
Androgen receptor (encoded by AR gene) which have 3 functional domains and function as a transcription 
factor activated by steroid hormone act as a potential prognostic biomarker in breast cancers including 
ER(+)BC. [43]. Overexpression of AR as a key molecule has been seen in aromatase inhibitors (eg. 
exemestane) resistant ER(+)BC. Therefore, AR antagonism may serve as a promising strategy in ER(+)BC 
which may in turn target several AR associated pathways like ERK1/2 signaling pathway, PI3K pathways etc. 
[44], [45] 

The pathways of Phosphoinositide 3-kinase PIK (composed of a regulatory and catalytic subunits) 
shows an emerging prevalence and are responsible for frequent alterations in breast cancers including 
ER(+)BC. [46]. The three genes i.e. PIK3CA, PIK3CB and PIK3CD encode three catalytic isoforms i.e. 
p110a, p110B and p1106 respectively and treatments using RNAi has been designed to target these isoforms. 
[47]-[49]. PIK3CA mutations have been reported crucial in fulvestrant resistant ER(+)BC using circulating 
tumor DNA (ct DNA), therefore PIK3CA has been found as an important regulator in progression and 
migration in ER(+)BC. [50]-[53] 

In-silico approach in drug development 
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Currently, In-silico approach has emerged as a promising strategy which ensures fruitful insights in 
the various fields like drug designing, target profiling, drug repurposing, polypharmacology, etc. [54]. 
Conventional pipeline remain centered on time-taking experiments and high delivery cost. Computer aided 
drug discovery (CADD) on other hand has exhibited as an effective technology for significant drug designing 
at a faster and cheaper rate in several diseases including cancers. [55]—[57]. Virtual screening, High throughput 
screening, Molecular docking etc. are emergent techniques for screening of drugs against particular protein 
receptors involved in various diseases as well as their structural and binding analysis. [58], [59]. Many useful 
tools have been developed for virtual screening and CADD. Autodock 4 [60], Autodock Vina [60], PyRx [61], 
PyMol [62], etc. are some of the open source online softwares available for studying the affinities between 
ligands and proteins and their binding poses, crystallographic analysis, molecular dynamics evaluation etc. 

There are many databases available for ligand library screening like ZINC 15, PubChem, etc. 
IMPPAT (Indian Medicinal Plants, Phytochemistry and Therapeutics) is an online curated database 
(developed by The Institute of Mathematical Sciences, IMSc, India) which has many useful in-built 
webservers like admetSAR, FAF-Drugs4, FAF-QED, Open Babel, RDKit, etc. There is a huge library of 1742 
medicinal plants of Indian origin, 9596 phytochemicals present in this database which have been used in 
traditional Indian medicines since time immemorial. [63] 


2. MATERIALS AND METHODS 


1. Receptors selection 
510 genes associated with ER(+)BC were identified using DisGeNET, a curated database for genes 

associated with human diseases and variants. [64]. Then those 510 genes were filtered on the basis of Gene- 
disease association score (Scoreégda = 0.1), probability of being loss of function intolerant (pLI = 0.99), total 
no. of PMIDs supporting association (N. PMIDs = 7) and identified top hits were further evaluated. 

2. Protein-Protein Interaction study 
Types of protein-protein interactions and their interaction scores between the selected receptors were 
calculated using STRING (11.0), a database for identification of proteins and their functional interactions. 
[65]. The strength of associations among them was evaluated from the number of nodes, PPI enrichment value, 
number of edges, average node degree, average local clustering coefficient and expected number of edges. 

3. Binding site analysis 

Binding sites of receptors were identified using Discovery Studio Visualizer (DSV), a software for 
analyzing sequences, modeling molecular structures and other data. [66], [67]. It locates all surrounding amino 
acid residues of a bounded ligand to the protein within a distance under defined threshold and forms a 2-D 
image which depicts all the amino acid residues and type of bonds through which they were attached to the 
ligand. In this research, high resolution crystal structures of proteins with their previously experimentally 
identified inhibitors/drugs were used and their drug sites were evaluated using DSV to which our concerned 
ligand was supposed to bind. 

4. Protein Dataset 
Minimum resolution X-ray crystallographic structures of FGFR2 (PDB Id: 3RI1- catalytic domain of 
fibroblast growth factor receptor 2 kinase in complex with ARQO69, Resolution- 2.10 A, R-Value Free: 
0.247), ESR1 (PDB Id: 5ACC- Estrogen receptor | in complex with AZD9496, Resolution- 2.10 A, R-Value 
Free: 0.247), AR (PDB Id: 1E3G- Androgen receptor in complex with metribolone, Resolution- 2.10 A, R- 
Value Free: 0.247), PIK3CA (PDB Id: 3ZIM- Lipid kinase of Phosphatidylinositol 4,5-bisphosphate 3- 
kinase catalytic subunit alpha isoform in complex with covalent inhibitor C39 H35 N7 O3 S, Resolution- 
2.10 A, R-Value Free: 0.247), PIK3CB (PDB Id: 4BFR- Phosphatidylinositol 4,5-bisphosphate 3-kinase 
catalytic subunit beta isoform in complex with C19 Hz: N4 Os, Resolution- 2.10 A, R-Value Free: 0.247) 
and PIK3CD (PDB Id: 4XE0- Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit delta 
isoform in complex with Idelalisib, Resolution- 2.10 A, R-Value Free: 0.247) were retrieved from PDB. 

5. Protein preparation 
Before docking of concerned ligands with selected proteins, proteins were prepared by deleting the hetatoms 
(co-crystallized water molecules, nonstandard residues, nonpolar hydrogens, small molecules and lone pairs) 
using DSV. Later gasteiger charges and polar hydrogens were added and merged using PyRx. Here, a file in 
pdbqt format is generated which is required for docking via Autodock. 

6. Ligand library preparation 

IMPPAT database was used for literature analysis of natural compounds using in-built webservers 
admetSAR, FAF-Drugs4, FAF-QED, RDKit. Initially all 9596 phytochemicals were primarily screened and 
filtered on the basis of zero Lipinski ROS violations, zero Oral PhysChem Score (Traffic lights) which 
indicates good oral bioavailability, good GlaxoSmithKline’s (GSK’s) 4/400 score which indicates log P 
(octanol-water partition coefficient) < 4 and My (Molecular weight) < 400 Dalton, Pfizer's 3/75 rule's good 
values which indicates small molecules with values of TPSA (Topological polar surface area) > 75 and logP 
<3 A? indicating more drug likeliness and less toxicity, Egan rule's good values (—1.0 <logP <5.8 and TPSA 
<130 A?) which indicates good oral bioavailability, veber rule’s good value (TPSA<140 and number of 
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rotatable bonds < 10) which satisfies good oral bioavailability, good QEDw (Quantitative estimate of drug- 
likeness) score ( > 0.8) which evaluates the weighted geometric mean of log P, number of hydrogen donors 
(<5) and acceptors (<10), molecular weight, No. Of rotatable bonds, TPSA, number of structural alerts and 
number of aromatic rings. QEDw = 0.8 signifies high druggability of compounds. Later, the primarily screened 
compounds were further secondarily screened on the basis of good PK properties like positive for BBB (Blood 
Brain Barrier) permeability, positive for HIA (Human intestinal absorption), good solubility. After screening, 
ligands were prepared finally for docking after energy minimization for 200 conjugate gradient steps with an 
UFF force field using PyRx. 
7. Molecular docking 
After secondary screening and preparation of ligands and receptors, they were subjected to structure- 
based drug designing through molecular docking using PyRx (with in-built Autodock tool). After selecting 
the binding sites manually, a grid box of dimension 60x60x60 A was generated with grid spacing 0.375 A 
which confines the active sites of protein to which the concerned ligand was supposed to bind within a specific 
shortened area. Then, after completion of docking within the specific sites, the binding energies i.e. AG (in 
Kcal/mol) were noted and the compounds were selected on the basis of top hits/ autodock vina good scores. 
It is considered that lower the AG, more effective the binding is (in the range of AG < -8.0 Kcal/mol). 
8. Screening of drug likeliness and ADMET properties evaluation 
The drug likeliness was analyzed using admetSAR open-source tool. The docked ligands having good 
AG were again filtered through extensive ADMET properties such as good PK and PD properties like positive 
BBB permeability, positive HIA, positive calcium carbonate permeability, Non-inhibition of P-glycoprotein 
protein substrate and inhibitors, Non-inhibition of CYP 450 (Cytochrome P450) substrate and inhibitors 
(including 2C9, 2D6, 3A4, 1A2, 2C9, 2D6, 2C19, 3A4), low CYP 450 inhibitory promiscuity (CYP IP), Non- 
ames test toxicity. P-glycoproteins and CYP enzymes substrate play a fundamental role in metabolism of 
drugs, so their inhibition affects the drug likeliness property of drugs. After assessing binding scores, important 
ADMET properties, obtained drugs were subjected to binding amino acid residues evaluation using PyRx and 
PyMol and proposed as combinatorial therapeutic agents. 
9. Binding amino acid residues evaluation 
After molecular docking and extensive ADMET analysis, the binding amino acid residues of finally 
selected docked ligands and the type of bonds through which they are attached were observed by generating 
a 2-D image using DSV. Later PyMol was used for depicting the bounded ligands in the binding pocket of 
receptors. 
Results and discussion 
In this research, natural compounds from Indian medicinal plants were screened against potent receptors 
of ER(+)BC. Further, they were assessed on the basis of good ADMET properties. The 7 top natural 
compounds having excellent ADMET properties were found to have multi-targeting effects with efficient 
binding energies and similar binding residues against the binding pockets of selected receptors in ER(+)BC. 
(Work flow in figure: 1). 
1. Receptors selection 
The associated genes for ER(+)BC were screened using DisGenet in which we found total 510 associated 
genes, out of which 6 top genes (FGFR2, ESR1, PIK3CA, PIK3CB, PIK3CD, AR) were filtered on 
the basis of parameters like pLI, Scoregda, N.PMIDs as mentioned previously. These genes are 
basically responsible for translation of mentioned receptors/proteins. Moreover, it was found that 
these six genes were of ‘kinase’ and ‘nuclear receptor’ protein class and have a large number of 
other associated diseases. Table showing 6 potent receptors of ER(+)BC assessed using DisGenet 
database. All six genes were found to have DSIg (Disease specificity index) > 0.29, out of which 
FGFR2 showed highest DSIg = 0.38. Their DPIg (Disease pleiotropy index) was found > 0.8, pLi 
0.99, Scoregaa> 0.1, EIgaa (Evidence index) > 0.9, although out of 6, only FGFR2 and ESR1 were 
found to have 2 and 1 SNPs ( Single nucleotide polymorphisms) respectively. (Table 1). So, on the 
basis of above parameters, these 6 receptors were found potent to regulate the biological behavior 
of ER(+)BC, therefore proposed for further evaluation. 
2. Protein- protein interactions 
Types and strength of protein-protein associations between the selected 6 receptors were assessed using 
STRING (11.0) database (Figure 2) which defines PPI (protein-protein interaction) with confidence 
ranges for obtained data scores (low confidence scores <0.4; medium confidence: 0.4-0.7; high 
confidence: >0.7). All 6 were found to have firm interactions among themselves with significant 
interaction scores as mentioned in Table 2. Moreover, 6 nodes, 10 edges, average node degree = 
3.33, average local clustering coefficient = 0.806, 2 expected number of edges and PPI enrichment 
p-value = 3.48e-05 were found. These values are strong enough to elucidate the significant 
interactions among receptors. Therefore, targeting these receptors was considered very useful in 
targeting associated pathways simultaneously for an effective multi-targeted effect of drugs. These 
receptors were further evaluated for molecular docking simulations. 
3. Protein retrieval and binding sites evaluation 
Binding sites or inhibitory sites of proteins after retrieving their crystal structures from PDB in complex 
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with different experimentally identified inhibitors were assessed through DSV. The surrounding 
amino acid residues (which are considered as binding sites) of bounded ligand were noted from 
DSV which generates a 2-D image of bounded inhibitor/ligand and its surrounding amino acid 
residues within a distance of threshold value. As shown in Figure: 3, FGFR2, ESR1, PIK3CA, 
PIK3CB, PIK3CD, AR were found to have 10, 15, 11, 10, 11, 10 residues respectively. 

4. Ligands screening 


In this research, natural compounds from medicinal plants of Indian origin were assessed through 
IMPPAT database of 9596 phytochemicals. Initially they were filtered using FAF-Drugs4 and FAF- 
QED open-source cheminformatics platforms on the basis of LRoS5 violations, Oral PhysChem score 
(traffic lights), GSK’s 4/400, Pfizer’s 3/75, Veber's rule, Egan's rule, QEDw as mentioned 
previously. At this stage, we found 99 natural compounds satisfying above mentioned properties. 
Then out of them, 42 Natural compounds were filtered on the basis of PK and PD properties (BBB 
permeability preferentialy) using admetSAR open source tool. (Table: 4-6) 
5. Molecular docking evaluations 
In this research, Structure-based Drug Designing (SBDD) through molecular docking was undertaken 
using PyRx (with in-built Autodock). After molecular docking of 42 ligands at the selected binding 
sites. Here, 25 natural compounds out of 42 were found to have promising binding energies in the 
range of AG < -8.0 kcal/mol with RMSD lower bound and RMSD upper bound values = 0. These 
25. selected compounds were further evaluated through extensive ADMET analysis using 
admetSAR. (Table: 3-6) 
6. Prediction of ADMET properties 
After getting the screening done on the basis of AG (in the range < -8 kcal/mol), 23 natural compounds 
were assessed for ADMET properties for identification of likelihood of compounds using 
admetSAR online server on the basis of CYP450 inhibitory promiscuity, ames toxicity, as 
mentioned before. Here, top 7 natural compounds were found to possess mentioned desired 
properties (preferentially CYP inhibitory promiscuity) as shown in table 6. On other hand, 
compounds like 4-Hydroxysesamin (CID: CID:16745513), Pinoresinol (CID:17750970), Irilone 
(CID:5281779) among 23 compounds were refuted besides having very good AG score, due to not 
fulfilling the criteria of CYP inhibition. (Table: 3-6). Then these 7 natural compounds were finally 
selected and subjected for binding amino acid residues evaluation using DSV and PyMol, within 
the particular domain (kinase or nuclear receptor domain) of concerned receptor. (Figure: 4-9) 
7. Molecular docking studies of receptors 
Having good binding efficiencies (indicating good stability and longer half-life) with 6 receptors, 
promising ADMET properties and significant number of common interacting amino acid residues 
as compared with binding sites of receptors, 7 natural compounds compounds whose 2-D structures 
were taken from PubChem as shown in Figure 6, were finally proposed on the basis of receptor 
based drug screening or Structure-based Drug Designing (SBDD). Here, we found that among top 
7 compounds, some of them showed highest binding energies (as shown in table 3), AG (in the range 
< -8 kcal/mol), as well as significant number of common amino acid residues with binding AA 
residues of protein. (Figure: 4,5,7-9). Besides this, it was also observed that some ligands like 
Paulownin-CID:3084131 which showed AG of -9.7 Kcal/mol with PIK3CD receptor and AG of - 
8.7Kcal/mol with PIK3CA receptor; Cornigerine-CID: 100188 which showed AG of -8.3 Kcal/mol 
with PIK3CD receptor were refuted as no common amino residues were found. The various types 
of molecular interactions (H-bonds, m-sigma, alkyl, 2-alkyl, a-Sulfur, etc.) among the ligand and 
binding residues were also studied as shown in Tables (7-12) . Finally proposed and their 
combinations are believed to be promising in inhibiting the potential receptors of ER(+)BC in a 
multi-targeted way. Furthermore, their synergistic effects can be evaluated further in in-vitro studies 
on ER(+)BC cell lines. 
3.Conclusion 
In this research, virtual screening of natural compounds found in Indian medicinal plants was done against 
the key receptors of Estrogen positive breast cancer. Ligand library was created using IMPPAT database 
followed by molecular docking simulations and ADMET predictions using PyRx and admetSAR tools 
respectively. We hereby conclude that the combination of Carpusin-CID:134369, Paulownin-CID:3084131, 
Cornigerine-CID:100188, Nororientaline-CID:440632, Oryzalexin B-CID:176496, Romucosine H- 
CID:5320994 and Colchicine-CID:6167 are potential natural compounds against key receptors in clinical 
expression of Estrogen positive breast cancer. Among them, Carpusin, Paulownin, Oryzalexin B and 
colchicine showed excellent ADMET properties. These seven compounds which possess good PK, PD, 
Physiochemical properties as well as high drug likeliness shows efficient binding with these key receptors. 
Therefore, they can be further explored for their synergistic efficiency in molecular dynamics studies to prove 
their potential to act as combinatorial therapeutic agents against estrogen receptor positive breast cancer. 
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Figure 1: Figure showing methodology of virtual screening used in research. 

Figure 2: Protein-Protein interactions of selected receptors for ER(+)BC using STRING (11.0) 
database. Colored lines depict edges and types of interactions as shown in key alongside. 

Figure 3: Binding sites of (A) FGFR2 (PDB Id: 3RI1- catalytic domain of fibroblast growth factor 
receptor 2 kinase in complex with ARQ069), (B) ESR1 (PDB Id: 5ACC- Estrogen receptor 1 in 
complex with AZD9496), (C) PIK3CA (PDB Id: 3ZIM- Lipid kinase of Phosphatidylinositol 4,5- 
bisphosphate 3-kinase catalytic subunit alpha isoform in complex with covalent inhibitor 
C30 H35 N7 O3 S), (D) PIK3CB (PDB Id: 4BFR- Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic 
subunit beta isoform in complex with C19 H22 N4 O3), (E ) PIK3CD (PDB Id: 4XE0- Phosphatidylinositol 
4,5-bisphosphate 3-kinase catalytic subunit delta isoform in complex with Idelalisib), (F) AR (PDB Id: 1E3G- 
Androgen receptor in complex with metribolone). 

Figure 4: Figure showing 2-D image of binding amino acid residues of ligands and 3-D molecular 
surface representation of binding pocket of FGFR2 receptor. Ligands are shown in stick format 
bounded in receptor’s pocket (Green cyan). (A) Paulownin-CID:3084131, (B) Cornigerine- 
CID:100188, (C) Oryzalexin B-CID:176496. 

Figure 5: Figure showing 2-D image of binding amino acid residues of ligands and 3-D molecular 
surface representation of binding pocket of ESR1 receptor. Ligands are shown in stick format 
bounded in receptor’s pocket (Green). (A) Carpusin-CID:134369, (B) Nororientaline-CID:440632, 
(C) Oryzalexin B-CID:176496. 

Figure 6: Chemical structures of natural compounds taken from PubChem, shown with their 
respective PubChem IDs and plants. (A) Carpusin-CID: 134369 found in Petrocarpus marsupium and 
Rheum austral, (B) Paulownin-CID:3084131 found in Gmelina arborea, (C) Cornigerine- 
CID:100188 found in Gloriosa superba, (D) Nororientaline-CID:440632 found in Erythrina fusca, 
(E) Oryzalexin B-CID:176496 found in Oryza sativa, (F) Romucosine H-CID:5320994 found in 
Annona cherimola, (G) Colchicine-CID:6167 found in Colchicum autumnale, Colchicum luteum, 
Crocus sativus and Gloriosa superba. 

Figure 7: Figure showing 2-D image of binding amino acid residues of ligands and 3-D molecular 
surface representation of binding pocket of PIK3CA receptor. Ligands are shown in stick format 
bounded in receptor’s pocket (Yellow). (A) Cornigerine-CID:100188, (B) Nororientaline- 
CID:440632, (C) Romucosine H-CID:5320994 

Figure 8: Figure showing 2-D image of binding amino acid residues of ligands and 3-D molecular 
surface representation of binding pocket of PIK3CD receptor. Ligands are shown in stick format 
bounded in receptor’s pocket (Grey). (A) Romucosine H-CID:5320994, (B) Colchicine-CID:6167 
Figure 9: Figure showing 2-D image of binding amino acid residues of ligands with their type of 
interactions and 3-D molecular surface representation of binding pocket of PIK3CB and AR 
receptors. Ligands are shown in stick format bounded in receptor’s pocket (Golden yellow- PIK3CB, 
Red- AR). (A) Carpusin-CID: 134369 with PIK3CB, (B) Carpusin-CID: 134369 with AR. 
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single nucleotide polymorphisms associated with gene). 
Table 2: Protein-Protein interaction score obtained from STRING (11.0) database. 
Table 3: Binding energy scores of 7 top ligands with selected receptors. Scores highlighted in red 
were selected as best ligands with top binding energies with significant number of common 
interacting amino acid residues with receptor’ binding sites. 
Table 4: Druggability properties of top 7 selected natural compounds. Computed via FAF-Drugs4 
and FAF-QED. 
Table 5: Physicochemical properties of top 7 selected ligands. Computed via FAF-Drugs4 and RDKit 
Table 6: ADMET analysis of top 7 selected ligands. Computed via admetSAR. 
Table 7: Types of molecular interactions of top 3 ligands with FGFR2. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 
Table 8: Types of molecular interactions of top 3 ligands with ESR1. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 
Table 9: Types of molecular interactions of top 3 ligands with PIK3CA. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 
Table 10: Types of molecular interactions of top 1 ligands with PIK3CB. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 
Table 11: Types of molecular interactions of top 2 ligands with PIK3CD. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 
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12. Table 12: Types of molecular interactions of top 3 ligands with AR. Binding amino acid residues 
highlighted in red indicate similarity with the binding sites of receptor. 


Abbreviations: 


ER(+)BC: Estrogen Receptor Positive Breast CANCER 
WHO: World Health Organization 

DNA: Deoxyribonucleic acid 

ICIs: Immune Checkpoint Inhibitors 

CART: Chimeric Antigen Receptor T-Cell Therapy 
SPIONs: Supermagnetic iron oxide nanoparticles 

SARS COV-2: Severe Acute Respiratory Syndrome 
Coronavirus 2 

USFDA: United States Food and Drug Administration 
GnRHa: Gonadotropin releasing hormone agonists 
SERMSs: Selective estrogen receptor modulators 
SERDs: Selective estrogen receptor down-regulators 
FGFR2: Fibroblast Growth Factor Receptor 2 

ESRI: Estrogen Receptor 1 

PIK3CA: Phosphatidylinositol 4,5-bisphosphate 3- 
kinase Catalytic subunit alpha 

PIK3CB: Phosphatidylinositol 4,5-bisphosphate 3- 
kinase Catalytic subunit beta 

PIK3CD: Phosphatidylinositol 4,5-bisphosphate 3- 
kinase Catalytic subunit delta 

AR: Androgen Receptor 

CXCL8: C-X-C Motif Chemokine Ligand 8 


PMID: Unique identifier number used in PubMed 
STRING: Search Tool for the Retrieval of Interacting 
Genes/Proteins 

FAF-Drugs4: Free ADME-Tox Filtering Tool 

QEDw: Weighted quantitative Estimate of Drug-likeness 
DisGeNET: Database of gene-disease associations 
Scoregda: Gene-disease association score 

pLI: Probability of being loss of function intolerant 
N.PMIDs: No. of PMIDs) 

DSIg: Disease specificity index), 

DPIg: Disease pleiotropy index), 

Elgaa: Evidence index), 

EL gaa: Evidence level), 

N_diseases ,: No. of associated diseases to gene), 

N. SNPSgaai No. of single nucleotide polymorphisms 
associated with gene). 

DSV: Discovery Studio Visualizer 

TPSA: Topological polar surface area 

GSK: GlaxoSmithKline 

log P: octanol-water partition coefficient 

BBB: Blood Brain Barrier 

HIA: Human intestinal absorption 


: : ADMET: Absorption, Distribution, Metabolism, 
BMP?7: Bone Morphogenetic Protein 7 Excretion and Toxicity 
MMP%9: Matrix Metallopeptidase 9 CYP 450: Cytochrome P450 
CD44: A Multifunctional Cell Surface Adhesion DSjJg: Disease specificity index 
Receptor DPIg: Disease pleiotropy index 
ESR2: Estrogen Receptor 2 PPI: Protein-Protein Interaction 
CDK 4/6: Cyclin-dependent kinase 4/6 PK: Pharmacokinetic properties 
ct-DNA: circulating famOr DNA Lo, PD: Pharmacodynamic properties 
ERK1/2: Extracellular signal-regulated protein kinase '4 RMSD: Root Mean Square Deviation 
CADD: Comp ea aided drug discovery SBDD: Structure-based Drug Designing 
IMSc: The Institute of Mathematical Sciences LRoS: Lipinski Rule of five 
IMPPAT: Indian Medicinal Plants, Phytochemistry and LDS50: Lethal Dose, 50% 
Therapeutics , , 
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